Lexical Variability and Compositionality: Investigating Idiomaticity with Distributional Semantic Models
Abstract
In this work we carried out an idiom type identification task on a set of 90 Italian V-NP and V-PP constructions comprising both idioms and non-idioms. Lexical variants were generated from these expressions by replacing their components with semantically related words, extracted both distributionally and from the Italian section of MultiWordNet. In distributional semantic spaces, idiomatic phrases turned out to be less similar to their lexical variants than non-idiomatic phrases were. Different variant-based distributional measures of idiomaticity were tested. Our indices also proved reliable in identifying idioms whose lexical variants are poorly attested, or not attested at all, in our corpus.
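As a rough illustration of the variant-based idea described in the abstract, the following minimal sketch scores an expression by the mean cosine similarity between its vector and the vectors of its lexical variants. It is not the authors' implementation: the function names, the toy three-dimensional vectors, and the choice to return 0.0 when no variant is attested are all assumptions made for this example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_variant_similarity(target, variants):
    """Mean cosine between a target expression vector and its variant vectors.

    Assumption for this sketch: an expression with no attested variants
    gets a score of 0.0. The source does not specify this choice.
    """
    if not variants:
        return 0.0
    return sum(cosine(target, v) for v in variants) / len(variants)


# Toy vectors, invented purely for illustration (not data from the paper).
idiom_target = [1.0, 0.0, 0.0]
idiom_variants = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

literal_target = [1.0, 1.0, 0.0]
literal_variants = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0]]

idiom_score = mean_variant_similarity(idiom_target, idiom_variants)
literal_score = mean_variant_similarity(literal_target, literal_variants)

# A lower score means the expression is less similar to its variants,
# which the abstract associates with idiomaticity.
print(f"idiom: {idiom_score:.3f}  literal: {literal_score:.3f}")
```

In this toy setup the idiom-like target is orthogonal to its variants and so scores lower than the literal-like target, mirroring the direction of the effect reported in the abstract. A real experiment would use corpus-derived vectors and would need an explicit policy for unattested variants.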
Similar resources
Exploring idiomaticity with variant-based distributional measures and Shannon’s entropy
The goal of this research is to investigate whether we can take advantage of the syntactic and lexical fixedness of idiomatic expressions to devise corpus-based indices of idiomaticity and compositionality and whether these measures can actually predict human ratings of idiom syntactic flexibility. First of all we describe a method for automatically distinguishing potential idioms from only lit...
Combining Different Features of Idiomaticity for the Automatic Classification of Noun+Verb Expressions in Basque
We present an experimental study of how different features help measuring the idiomaticity of noun+verb (NV) expressions in Basque. After testing several techniques for quantifying the four basic properties of multiword expressions or MWEs (institutionalization, semantic non-compositionality, morphosyntactic fixedness and lexical fixedness), we test different combinations of them for classifica...
Measuring the compositionality of NV expressions in Basque by means of distributional similarity techniques
We present several experiments aiming at measuring the semantic compositionality of NV expressions in Basque. Our approach is based on the hypothesis that compositionality can be related to distributional similarity. The contexts of each NV expression are compared with the contexts of its corresponding components, by means of different techniques, as similarity measures usually used with the Ve...
Fusion of Compositional Network-based and Lexical Function Distributional Semantic Models
Distributional Semantic Models (DSMs) have been successful at modeling the meaning of individual words, with interest recently shifting to compositional structures, i.e., phrases and sentences. Network-based DSMs represent and handle semantics via operators applied on word neighborhoods, i.e., semantic graphs containing a target’s most similar words. We extend network-based DSMs to address comp...
Introducing PersPred, a Syntactic and Semantic Database for Persian Complex Predicates
This paper introduces PersPred, the first manually elaborated syntactic and semantic database for Persian Complex Predicates (CPs). Besides their theoretical interest, Persian CPs constitute an important challenge in Persian lexicography and for NLP. The first delivery, PersPred 11, contains 700 CPs, for which 22 fields of lexical, syntactic and semantic information are encoded. The semantic cla...
Publication date: 2016